Learning Optimal Policies in Markov Decision Processes with Value Function Discovery
Similar resources
Discovery of Structured Optimal Policies in Markov Decision Processes
In this chapter we continue work on VFD, the novel method for discovery of relative value functions for Markov Decision Processes that we introduced in Chapter 6. VFD discovers algebraic descriptions of relative value functions using ideas from the Evolutionary Algorithm field and, in particular, these descriptions include the model parameters of the MDP. We extend that work and demonstrate how a...
Identification of optimal policies in Markov decision processes
In this note we focus on identifying optimal policies and on eliminating suboptimal policies when minimizing optimality criteria in discrete-time Markov decision processes with a finite state space and a compact action set. We present a unified approach to value iteration algorithms that makes it possible to generate lower and upper bounds on the optimal values, as well as on the current policy. Using the mod...
Value Iteration and Action ε-Approximation of Optimal Policies in Discounted Markov Decision Processes
It is well known that in Markov Decision Processes, for instance with a total discounted reward, it is not always possible to find the optimal stationary policy f∗ explicitly. Value Iteration, however, yields at the N-th iteration of the procedure a stationary policy f_N such that the discounted rewards of f∗ and f_N are close. A question then arises: are the actions f∗(x) and f_N(x) necessa...
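As a rough illustration of the setting in this abstract, the sketch below runs value iteration on a small discounted MDP and reads off the greedy stationary policy after N iterations, so that the policy obtained after a few iterations can be compared with the one obtained after many. The two-state, two-action model, its transition probabilities and rewards, the discount factor, and the function names are all made up for this example and are not taken from the paper.

```python
# Hypothetical two-state, two-action MDP; every number here is an assumption.
# P[a][s][t] = probability of moving from state s to state t under action a.
P = [
    [[0.9, 0.1], [0.4, 0.6]],  # action 0
    [[0.2, 0.8], [0.1, 0.9]],  # action 1
]
# R[a][s] = expected one-step reward for taking action a in state s.
R = [
    [1.0, 0.0],  # action 0
    [0.0, 2.0],  # action 1
]
GAMMA = 0.9  # discount factor (assumed)


def q_values(V, s):
    """One-step lookahead value of each action in state s, given values V."""
    return [
        R[a][s] + GAMMA * sum(P[a][s][t] * V[t] for t in range(len(V)))
        for a in range(len(P))
    ]


def value_iteration(n_iter):
    """Apply n_iter Bellman updates starting from V = 0.

    Returns the value estimate V_N and the policy f_N that is greedy
    with respect to V_N.
    """
    V = [0.0] * len(R[0])
    for _ in range(n_iter):
        V = [max(q_values(V, s)) for s in range(len(V))]
    policy = []
    for s in range(len(V)):
        q = q_values(V, s)
        policy.append(max(range(len(q)), key=lambda a: q[a]))
    return V, policy


V_5, f_5 = value_iteration(5)
V_500, f_500 = value_iteration(500)
print("f_5   =", f_5, " V_5   =", [round(v, 3) for v in V_5])
print("f_500 =", f_500, " V_500 =", [round(v, 3) for v in V_500])
```

With finitely many actions, as here, the greedy policy typically stops changing after finitely many iterations. The question raised in the abstract is more delicate when the action set is continuous, which this sketch does not cover.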
Learning Qualitative Markov Decision Processes
To navigate in natural environments, a robot must decide the best action to take according to its current situation and goal, a problem that can be represented as a Markov Decision Process (MDP). In general, it is assumed that a reasonable state representation and transition model can be provided by the user to the system. When dealing with complex domains, however, it is not always easy or pos...
Journal
Journal title: ACM SIGMETRICS Performance Evaluation Review
Year: 2015
ISSN: 0163-5999
DOI: 10.1145/2825236.2825239